11 research outputs found

    Average Case Analysis of Leaf-Centric Binary Tree Sources

    We study the average size of the minimal directed acyclic graph (DAG) with respect to so-called leaf-centric binary tree sources as studied by Zhang, Yang, and Kieffer. A leaf-centric binary tree source induces for every $n \geq 2$ a probability distribution on all binary trees with $n$ leaves. We generalize a result shown by Flajolet, Gourdon, Martinez and Devroye according to which the average size of the minimal DAG of a binary tree that is produced by the binary search tree model is $\Theta(n / \log n)$.

    Tunneling on Wheeler Graphs

    Baier (CPM 2018) describes tunneling as a technique to further exploit redundancies in the Burrows-Wheeler Transform. In this paper we show how to retain indexed text searching on the resulting structure and generalize the concept to Wheeler graphs. Peer reviewed.

    Kombinatorische und informationstheoretische Aspekte der Baumkompression

    We analyze lossless tree compression algorithms under information-theoretic and combinatorial aspects. One of the most important and widely used compression methods for rooted trees is to represent a tree by its minimal directed acyclic graph, referred to as the minimal DAG for short. The size of the minimal DAG of the tree is the number of distinct fringe subtrees occurring in the tree, where a fringe subtree of a rooted tree is a subtree induced by one of the nodes and all its descendants. In the first part of this work, we study the average number of distinct fringe subtrees (i.e., the average size of the minimal DAG) in random trees. Specifically, we consider the random tree models of leaf-centric binary tree sources, simply generated families of trees and very simple families of increasing trees. In the second part of this work, we analyze grammar-based tree compression via tree straight-line programs (TSLPs) from an information-theoretic point of view. Specifically, we extend the notion of empirical entropy from strings to node-labeled binary trees and plane trees and show that a suitable binary encoding of TSLPs yields binary tree encodings of size bounded by the empirical entropy plus some lower-order terms. This generalizes recent results from grammar-based string compression to grammar-based tree compression. In the third part of this work, we present a new compressed encoding of unlabeled binary and plane trees. We analyze this encoding from an information-theoretic point of view by proving that it is universal and thus asymptotically optimal for a great variety of tree sources; this covers in particular the vast majority of tree sources with respect to which previous tree source codes were shown to be universal.
    We analyze lossless methods of tree compression from information-theoretic and combinatorial perspectives. A widely used method of tree compression is so-called DAG compression, in which a tree is represented by its minimal directed acyclic graph (DAG). The size of this minimal DAG of a tree is the number of distinct fringe subtrees of the tree. A fringe subtree of a rooted tree is a subtree induced by one of the nodes together with all its descendants. In the first part of this work, we analyze the expected number of distinct fringe subtrees (i.e., the average size of the minimal DAG) with respect to various probability distributions on various families of trees. We consider the model of leaf-centric tree sources, the model of simply generated families of trees and the model of increasing trees. In the second part of this work, we analyze grammar-based tree compression via so-called tree straight-line programs (TSLPs). We extend the notion of empirical entropy from words to trees and show that a suitable binary encoding of TSLPs yields binary tree encodings whose size is bounded by the empirical entropy (plus lower-order terms). In the third part of this work, we present a new compressed representation of trees that is universal and therefore optimal with respect to a large number of tree distributions; in particular, this also holds for the majority of distributions with respect to which universality could be shown for previous tree encodings.
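
    The definition above (size of the minimal DAG = number of distinct fringe subtrees) can be illustrated with a minimal Python sketch. It is not taken from the thesis: the tree representation (`None` for a leaf, a pair for an inner node) and the name `minimal_dag_size` are assumptions made for illustration, and "distinct" is read here as distinct plane binary trees.

```python
from typing import Optional, Tuple

# Assumed representation: a binary tree is None (a leaf)
# or a pair (left, right) of binary trees.
Tree = Optional[Tuple["Tree", "Tree"]]


def minimal_dag_size(tree: Tree) -> int:
    """Count the distinct fringe subtrees of `tree`.

    Each distinct fringe subtree gets one canonical id, so the number
    of ids equals the number of nodes of the minimal DAG.
    """
    ids = {}  # maps "leaf" or (left_id, right_id) to a canonical id

    def canon(t: Tree) -> int:
        # Two subtrees are equal exactly when their children have equal ids.
        key = "leaf" if t is None else (canon(t[0]), canon(t[1]))
        return ids.setdefault(key, len(ids))

    canon(tree)
    return len(ids)


# A complete tree with four leaves has seven nodes but few distinct
# fringe subtrees; a left comb with four leaves has no repeated inner subtree.
complete = ((None, None), (None, None))
comb = (((None, None), None), None)
print(minimal_dag_size(complete), minimal_dag_size(comb))
```

    The sketch recurses once per node, so it suits small examples; very deep trees would need an iterative traversal.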

    Distinct Fringe Subtrees in Random Trees

    A fringe subtree of a rooted tree is a subtree induced by one of the vertices and all its descendants. We consider the problem of estimating the number of distinct fringe subtrees in random trees under a generalized notion of distinctness, which allows for many different interpretations of what "distinct" trees are. The random tree models considered are simply generated trees and families of increasing trees (recursive trees, d-ary increasing trees and generalized plane-oriented recursive trees). We prove that the order of magnitude of the number of distinct fringe subtrees (under rather mild assumptions on what "distinct" means) in random trees with $n$ vertices is $n / \sqrt{\log n}$ for simply generated trees and $n / \log n$ for increasing trees.

    Universal Tree Source Coding Using Grammar-Based Compression
